Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary
Authors
Abstract
This paper describes the first successfully implemented real-time Mandarin dictation machine in the world, which recognizes Mandarin speech with a very large vocabulary and almost unlimited texts for the input of Chinese characters into computers. Considering the special characteristics of the Chinese language, syllables are chosen as the basic units for dictation. The machine is speaker dependent, and the input speech is in the form of sequences of isolated syllables. The machine can be decomposed into two subsystems. The first subsystem recognizes the syllables using hidden Markov models: special training algorithms and recognition approaches have been developed to recognize the 408 very confusing syllables (disregarding the tones), and special feature vectors are used to recognize the five different tones, including the very confusing neutral tone. Syllable recognition alone does not help very much, however, because every syllable can represent many different homonym characters and can form different multisyllabic words with the syllables on its right or left. The second subsystem is therefore needed to identify the exact characters from the syllables and to correct the errors in syllable recognition. It first finds all possible word hypotheses and forms a word lattice for the sequence of recognized syllables through a lexical access process, and then obtains the maximum-likelihood path in the lattice as the output sentence using a data-trained Markov Chinese language model. The real-time implementation runs on an IBM PC/AT connected to three sets of specially designed hardware boards on which seven TMS 320C25 chips operate in parallel. The preliminary test results indicate that it takes only about 0.45 s to dictate a syllable (or character), with an accuracy on the order of 90%. All techniques used in this machine are described and discussed in detail in this paper.
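The second subsystem's two steps (lexical access into a word lattice, then a best-path search under a Markov language model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the toy `LEXICON`, the `BIGRAM` log-probabilities, the `FLOOR` back-off value, the use of a first-order (bigram) model as a stand-in for the paper's data-trained Markov Chinese language model, and the function names `build_lattice` and `best_path`. The sketch also assumes the syllable sequence is already correct, so it does not attempt the error correction the abstract mentions.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicon: toneless syllable tuple -> candidate words.
# Homonym characters share the same key.
LEXICON = {
    ("dian",): ["电", "店"],
    ("nao",): ["脑", "闹"],
    ("dian", "nao"): ["电脑"],
    ("shu",): ["书", "输"],
    ("ru",): ["入", "如"],
    ("shu", "ru"): ["输入"],
}

# Hypothetical bigram log-probabilities; unseen pairs fall back to FLOOR.
BIGRAM = {
    ("<s>", "电脑"): math.log(0.2),
    ("电脑", "输入"): math.log(0.3),
}
FLOOR = math.log(1e-4)


def build_lattice(syllables, max_len=2):
    """Lexical access: collect every lexicon word covering syllables[i:j]."""
    edges = defaultdict(list)  # start index -> [(end index, word), ...]
    for i in range(len(syllables)):
        for j in range(i + 1, min(i + max_len, len(syllables)) + 1):
            for word in LEXICON.get(tuple(syllables[i:j]), []):
                edges[i].append((j, word))
    return edges


def best_path(syllables):
    """Viterbi search for the highest-scoring word sequence in the lattice."""
    edges = build_lattice(syllables)
    n = len(syllables)
    # State = (position, last word); value = (log score, back-pointer state).
    best = {(0, "<s>"): (0.0, None)}
    for i in range(n):
        # Snapshot the states ending at position i before extending them.
        states = [(k, v) for k, v in best.items() if k[0] == i]
        for (pos, prev), (score, _) in states:
            for j, word in edges[i]:
                cand = score + BIGRAM.get((prev, word), FLOOR)
                key = (j, word)
                if key not in best or cand > best[key][0]:
                    best[key] = (cand, (pos, prev))
    finals = [(v[0], k) for k, v in best.items() if k[0] == n]
    if not finals:
        return []  # no word sequence covers the whole input
    _, key = max(finals)
    words = []
    while key != (0, "<s>"):
        words.append(key[1])
        key = best[key][1]
    return words[::-1]


if __name__ == "__main__":
    print(best_path(["dian", "nao", "shu", "ru"]))
```

With this toy data, the two-syllable words win over any sequence of single homonym characters because their bigram scores are far above the floor, which is the disambiguation effect the abstract attributes to the language model.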
Similar resources
Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data
This correspondence presents the first known results of complete recognition of continuous Mandarin speech for the Chinese language with very large vocabulary but very limited training data. Various acoustic and linguistic processing techniques were developed, and a prototype system of a continuous speech Mandarin dictation machine has been successfully implemented. The best recognition accurac...
Tangerine: a large vocabulary Mandarin dictation system
The text input for non-alphabetic languages, such as Chinese, has been a decades-long problem. Chinese Dictation using large vocabulary speech recognition provides a convenient mode of text entry. In contrast to a character based Dictation system [5], a word-based Mandarin dictation system has been designed [3] (based on Apple's PlainTalk speech recognition technology [4]) for efficient entry o...
A multi-pass error detection and correction framework for Mandarin LVCSR
We previously proposed a multi-pass framework for Large Vocabulary Continuous Speech Recognition (LVCSR). The objective of this framework is to apply sophisticated linguistic models for recognition, while maintaining a balance between complexity and efficiency. The framework is composed of three passes: initial recognition, error detection and error correction. This paper presents and evaluates...
The Preliminary Results of a Mandarin Dictation Machine Based Upon Chinese Natural Language Analysis
This paper describes the preliminary results of the first research effort toward a Mandarin dictation machine in the world for the input of Chinese characters to computers. Considering the special characteristics of Chinese language, syllables are chosen as the basic units for dictation. The machine is divided into two subsystems. The first is to recognize the syllables using speech signal proc...
Web-based, Speech-enabled Games for Vocabulary Acquisition in a Foreign Language
In this thesis, I present two novel ways in which speech recognition technology might aid students with vocabulary acquisition in a foreign language. While research in the applied linguistics field of second language acquisition (SLA) increasingly suggests that students of a foreign language should learn through meaningful interactions carried out in that language, teachers are rarely equipped ...
Journal title:
- IEEE Trans. Speech and Audio Processing
Volume 1, Issue
Pages -
Publication date 1993